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Abstract 

Background: Schizophrenia is a neurodegenerative disorder that occurs worldwide 
and can be difficult to diagnose. It is the foremost neurological disorder leading to 
suicide among patients in both developed and underdeveloped countries. D-amino 
acid oxidase activator {DAOA), also known as G72, is directly implicated in the 
glutamateric hypothesis of schizophrenia. It activates D-amino acid oxidase, which 
oxidizes D-serine, leading to modulation of the N-methyl-D-aspartate receptor. 

Methods: MODELLER (9v10) was utilized to generate three dimensional structures of 
the DAOA candidate gene. The HOPE server was used for mutational analysis. The 
Molecular Evolutionary Genetics Analysis (MEGA5) tool was utilized to reconstruct the 
evolutionary history of the candidate gene DAOA. AutoDock was used for 
protein-ligand docking and Gramm-X and PatchDockfor protein-protein docking. 

Results: A suitable template (1ZCA) was selected by employing BLASTp on the basis 
of 33% query coverage, 27% identity and E-value 4.9. The Rampage evaluation tool 
showed 91.1% favored region, 4.9% allowed region and 4.1% outlier region in DAOA. 
ERRAT demonstrated that the predicted model had a 50.909% quality factor. 
Mutational analysis of DAOA revealed significant effects on hydrogen bonding and 
correct folding of the DAOA protein, which in turn affect protein conformation. 
Ciona was inferred as the outgroup. Tetrapods were in their appropriate clusters with 
bifurcations. Human amino acid sequences are conserved, with chimpanzee and 
gorilla showing more than 80% homology and bootstrap value based on 1000 
replications. Molecular docking analysis was employed to elucidate the binding 
mode of the reported ligand complex for DAOA. The docking experiment 
demonstrated that DAOA is involved in major amino acid interactions: the residues 
that interact most strongly with the ligand C28H28N3O5PS2 are polar but uncharged 
(Gln36, Asn38, Thr 122) and non-polar hydrophobic (Ile119, Ser171, Ser21, Ala31). 
Protein-protein docking simulation demonstrated two ionic bonds and one 
hydrogen bond involving DAOA. Lys-7 of the receptor protein interacted with 
Lys-163 and Asp-2037. Tyr-03 interacted with Arg-286 of the ligand protein and 
formed a hydrogen bond. 
(Continued on next page) 
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Conclusion: The predicted interactions might serve to inhibit the disease-related 
allele. It is assumed that current bioinformatics methods will contribute significantly 
to identifying, analyzing and curing schizophrenia. There is an urgent need to 
develop effective drugs for schizophrenia, and tools for examining candidate genes 
more accurately and efficiently are required. 

Keywords: Schizophrenia, Bioinformatics, Modeling, Docking, DAOA, Phylogenetic 
analysis 



Background 

The nature of a human medical disorder is often elucidated through biological markers 
and behavioral studies. Diagnosis of mental disorders is very difficult because it primar- 
ily relies on behavioral markers. An example of a complex mental disorder is schizo- 
phrenia (SZ), diagnosis of which depends on abnormal behavior such as paranoia, 
dampening of emotions and auditory hallucinations. Genome-wide studies have 
attained a major role in SZ research because high- throughput technologies are valuable 
for discovering relevant genes. SZ is a psychiatric disorder with severe manifestations - 
abnormal behavior, disorganized speech and figments of the imagination - and an esti- 
mated heritability of about 80% [1]. Negative symptoms can also include affective 
flattening, avolition, and alogia. Approximately 1% of the population is affected during 
the course of life. The effects of SZ usually start during the patients late teens to early 
twenties; females have an age of onset five years later than males [2]. A recent meta- 
data analysis estimated the risk of SZ in males to be about 40% higher than in females 
[3]. Epidemiological studies of SZ have shown that it occurs in all populations with a 
prevalence of approximately 1.5-4.5 per thousand and an incidence of 0.17-0.43 per 
thousand [4]. 

According to analysis of gene linkage data and meta-analysis of genome scans [5], 
highly vulnerable genes on chromosomes lq, 3p, 5q, 6p, 8p, llq, 14p, 20q and 22q 
[6,7] contribute to SZ. Both functional and positional candidate SZ genes have been 
studied and various promising candidates that might be involved in risk for the disease 
have been identified. 

The symptoms of SZ have different dimensions that usually occur together and can 
reflect substantial variation among patient phenotypes [8-10]. Different researchers 
have formulated various models of these dimensions but the most widely appreciated 
3D models were first proposed by Bilder et al and Liddle [9,11]. These authors con- 
cluded that the main symptoms are poverty of speech, formal thought disorder, 
decreased voluntary movement, psychomotor impairment, bizarre behavior, hallucina- 
tions, abnormal acts, inappropriate affects, flat affects, flattening, avolition, and alogia. 

A genome-wide association study (GWAS) for SZ was conducted in 2008 but no sig- 
nificant loci were reported, though 7000 samples were used [12,13]. 

The gene DAOA, located on chromosome 13q3, encodes the D-amino acid oxidase 
activator protein, as shown by functional and expression studies. It is significantly asso- 
ciated with SZ and is also known as G72. The D-amino acid oxidase activator (DAOA) 
is directly implicated in the glutamateric hypothesis of SZ [14]. When D-amino acid 
oxidase is activated, D-serine is oxidized and the product modulates the N-methyl-D- 
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aspartate receptor. Modulation of this receptor leads to the cause of SZ; glutamate sig- 
naling is involved in important pathways directly linked to SZ [15]. 

DAOA is also involved in other psychotic disorders and can modify the cognitive and 
negative symptoms of mood. It could be the primary genetic cause of the observed 
overlap of phenotypes between bipolar disorder and SZ [16]. 

Bioinformatics has been used for in silico analysis of biological queries using math- 
ematical and statistical techniques. X-ray and NMR techniques are expensive and time- 
consuming for structural modeling of proteins. Screening of small chemical compounds 
against target receptors by high throughput screening (HTS) is very expensive. 

In this work, we predicted the 3D structure and the protein-ligand and protein- 
protein docking of DAOA using different bioinformatics strategies. The main aim of 
our research was to predict the 3D structure and docking. The objective of the present 
study was to elucidate the interactions of DAOA protein with ligands and other pro- 
teins and to identify the connection of DAOA to SZ. Protein-protein docking and inter- 
action simulations reveal hydrogen and ionic bonds. The present work was conducted 
to provide molecular insights into the structure of the protein and to find its most 
plausible function. 

Results 

This paper describes the implementation of an in silico technique to recruit and analyze 
DAOA, the most likely candidate gene for SZ. The direct involvement of DAOA in dis- 
ease pathogencity has already been reported in several research studies on SZ. 

Initially, a literature search was conducted to explore the most likely candidate gene 
involved in SZ. A comparative modeling technique (MODELER 9vl0) was adopted to 
predict the three dimensional structure of the protein encoded by the selected gene. 
The protein data bank (PDB) was checked for the 3D structure of the selected protein, 
and it was confirmed that no 3D structure had been predicted to date. To check the 
quality and reliability of the predicted model, the evaluation tools ERRAT and Rampage 
were used. 

Protein-ligand and protein-protein docking of DAOA were simulated. The ZINC and 
PubChem databases were used to retrieve the ligand and STRING was used to identify 
protein interactions [17]. 

DAOA has been mapped on chromosome 13, with starting and ending base pairs 
06118216 and 10143383 respectively. Homology modeling was implemented to gener- 
ate the 3D structure of the encoded protein. MODELER 9vl0 was used to construct 
the protein model. A basic local alignment technique (BLAST) was utilized to identify 
the homology between the target protein and its template. The lowest energy 
minimization value for the predicted structure was selected for further analysis. 

The 3D structure or modeling of DAOA is not known and no structural information 
can be found for the templates. The amino acid sequence of DAOA in FASTA format 
was retrieved from Uniprot with accession number A2T115. Table 1 lists the three tem- 
plates 1ZCA, 1V30 and 2E5K with optimal alignment of the first template and good 
alignment for the others, sorted by overall quality, query coverage, similarity and E- 
values. The structure predicted by MODELLER 9vl0 with the alpha helices and beta- 
pleated sheets visualized by Chimera 1.6 is illustrated in Figure 1(A). Figure 1(B) 
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Table 1 Templates for DAOA sorted by their overall quality query coverage, identity and 
E-values 



Accession ID 


Total score 


Query coverage 


E-value 


Max-identity 


1ZCA 


26.9 


33% 


4.9 


27% 


1V30 


30.8 


24% 


0.16 


45% 


2E5K 


26.9 


21% 


2.5 


41% 



demonstrates a superimposition of structure and template. The predicted structure is 
evaluated in Figures 2 and 3. 

Mutations in DAOA have been reported so text mining was used to retrieve them. 
The Arg30Lys mutation is directly involved in SZ. Arginine in the wild type protein is 
replaced with Lysine at position 30, in a highly conserved part of the amino acid se- 
quence. This mis-sense mutation occurs in rare cases but damages the protein when it 
occurs. The polypeptide backbone is not affected but the side chains of the two amino 
acids are distinct. Every amino acid has its own specific charge and hydrophobicity 
value. The mutated and wild type proteins differ in these properties. The mutated 
amino acid is also smaller than the wild type residue, affecting its interaction with other 
molecules. 

A dataset of proteins for a huge range of invertebrate and vertebrate genomes is cur- 
rently available for analyzing the phylogenetic history of the SZ candidate gene. The 
phylogenetic neighbor-joining (NJ) tree presented in this study reveals many interesting 
characteristics of the candidate gene in vertebrates (Figure 4). The phylogenetic history 
of the gene was analyzed by including the protein sequences from teleosts and tetra- 
pods in the tree; amphioxus sequences were also used as the closest invertebrate rela- 
tive to vertebrates. 
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Autodock VINA docking software was used to investigate how the ligand binds to 
the respective protein, the binding conformation, functionally interacting residues and 
best structural information. The ligand retrieved for DAOA is described in Table 2 and 
its structure is illustrated in Figure 5 (a and b). Docking of different ligand conforma- 
tions to the protein was simulated. About 50 complexes were generated by the soft- 
ware, and the one with the lowest binding energy was selected for further analysis. The 
docked complex of DAOA and 0,0 -diethyl thiophosphoryl (Z)-2-(2-aminothiazol-4-yl)- 




20 40 60 80 100 120 

Residue # (window center) 

Figure 3 ERRAT gave a quality factor of 50.909% to the DAOA model. 

V J 
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Figure 4 The evolutionary history of the DAOA gene was constructed. No paralogs of DAOA had been 
reported in biological databases. ENSEMBL Blastp/Blat was also performed for paralogs of DAOA but with 
the same result. Only orthologs of DAOA were used to construct the phylogenetic tree. Complete-deletion, 
p-distance and Uncorrected options were used. Bootstrap values are given as numbers of branches (based 
on 1000 replications) supporting that branch; values of >80% are presented. Amino acid substitution rate 
per site is shown by the scale bar. Eight orthologs were used and Ciona was the outgroup. Two main 
clusters were generated. Human, Gorilla and Chimpanzee lie in the same cluster and show sequence 
similarity. Tetrapods are also closely related to each other in the analysis. Organisms in the same clusters 
show the highest similarity with point mutations. 



2-trityloxyiminoacetate is illustrated in Figure 6. Figure 7(A) illustrates the docked 
DAOA complex as revealed by the VMD visualization tool Figures 7(B) and (C) illus- 
trate the protein-ligand post-docking analysis by Chimera and LigPlot respectively. 

The amino acids present at the active site of the protein were identified by observing 
the residues within 4A using VMD visualizing software. The hydrogen, hydrophobic 
and ionic interactions of the protein-docked ligand complex are described in Table 3. 

GRAMM-X and PatchDock were utilized to characterize the protein-protein docking 
of DAOA. The functional interacting partners of DAOA were retrieved from the 
STRING database. Figure 8 reveals the proteins that interact most closely with DAOA; 
DAO showed the highest interaction score (0.953). The 3D structure of DAO was 
retrieved from the PDB with accession number 2E48. The interaction between DAOA 
and DAO is illustrated in Figure 9. Table 4 lists the interactions and distances between 
the receptor and ligand proteins. 



Discussion 

SZ is a heterogeneous neural disorder with a spectrum of symptoms such as decrease 
in volition, displays of antisocial behavior, disordered sense of self, psychomotor slow- 
ing and alteration in perception. It has been accepted that SZ is caused by multiple or 
single dysfunctions within discrete brain regions. Numerous mechanisms and models 



Table 2 Ligand of DAOA protein was retrieved from the PubChem Ligand database 



Receptor 


Ligand 


Accession 


Ligand Name 


Formula 


Protein 


Database 


Number 






DAOA 


PubChem 


19894 


DAOA:0,0-Diethylthiophosphoryl (Z)-2-(2- 


C 28 H 28 N 3 0 5 PS 2 








aminothiazol-4-yl)-2-trityloxyiminoacetate 
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have been proposed for the development of SZ in terms of the timing, situation and na- 
ture of brain changes but the exact mechanisms are still not understood. 

No three-dimensional structure of DAOA was reported in the Protein Data Bank or 
resolved by X-ray crystallography and NMR. MODELLER 9vl0 was employed for com- 
parative modeling. Suitable templates for DAOA were identified by running BLASTp, 
which retrieved three options: 1ZCA, 1V30 and 2E5K. Among these, 1ZCA showed the 
highest query coverage. The model was predicted by all the templates and 1ZCA 
showed good evaluation results: 91.1% of the predicted amino acids fell in the favored 
region and 4.9% in the allowed region. Only four amino acids were outliers. ERRAT 
revealed a quality factor of 50.909%. 

The gene DAOA was subjected to mutational analysis and mutations were extracted 
from biological databases and the literature. Only one mutation of DAOA has been 
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(C) 

Figure 7 (A) Amino acid residues of active site of DAOA binding with DAOA: O ligand. The VMD 

visualizing tool helped in the visualization and analysis of the protein-ligand docking results. (B) Chimera 
visualization of DAOA and ligand. (C) LigPlot analysis of ligand and DAOA. 



Table 3 Binding interactions of receptor protein DAOA with ligand 


Receptor Protein Hydrogen Bonding O-H Ionic Bonding N-0 


Hydrophobic Interactions C-C 


Amino Acids 


Dist Amino Acids 


Amino Acids 


Dist 


DAOA 




lie 119:CB 


3.80 






UNK0:C 








Ser 71:CB 


3.89 






UNK0:C 




ASN38:OD1 


3.45 No ionic bonds 


Ser 121:CB 


3.56 


UNK0:H 




UNK0:C 




Gin 36: 0 


2.85 


Thr 122:CG2 


3.78 


UNK0:H 




UNK0:C 








lie 119:CG1 


3.84 






UNK0:C 








Ala 31:CA 


3.96 






UNK0:C 
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GRM3 




NRG 1 



DAO 



Figure 8 Interaction network of DAOA showed closely interacting proteins. DAO protein showed the 
closest interaction with DAOA with score 0.953; 3D structure is available in PDB. 



reported and authenticated in biological databases: the Arg30Lys mis -sense mutation, 
which affects protein function because the mutant protein is smaller than wild type 
and the region around residue 30 is conserved. The mutated protein has abnormal 
function and its binding and interactions with other proteins are affected, including 
components of signaling pathways. 

A phylogenetic tool (MEGA5) was used to construct a neighbor- joining tree of the 
selected candidate gene to determine its evolutionary history among human, mammals, 
tetrapods, primates, birds, teleosts and rodents. DAOA has no reported paralogs in bio- 
logical databases (ENSEMBL and UCSC) or in the literature. Orthologs of DAOA were 
used for tree construction. Ciona was inferred as outgroup in the trees. DAOA showed 
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Table 4 Binding interactions for DAOA and DAO 



Receptor 
Protein 


Interacting 
Protein 


Interactions (Receptor protein — > Interacting 
Protein residue) 


Bond 
Distance 


Interactions 
Type 


DAOA 


DAO 


Tyr-03/OH -> Arg-286/O 


2.5 


Hydrogen 
Bonding 






Lys-7/O^Lys-163/NZ 


3.5 


Ionic Bonding 






Lys-7/NZ^Asp-2037/OD1 


3.3 


Ionic Bonding 



a conserved sequence with primates having few mutations. It has evolved successively 
from ciona to human. 

Protein-ligand docking for DAOA was simulated using AutoDock software. The lig- 
and for receptor- ligand docking was retrieved from different databases, commonly 
PubChem and ZINC. DAOA: O, O-Diethylthiophosphoryl (Z)-2-(2-aminothiazol-4-yl)- 
2-trityloxyiminoacetate was used for the protein-ligand docking study. The complex 
with the lowest docked binding energy was selected for post-docking analysis using 
VMD software (version 1.9.1) and interactions between receptor and ligand were ana- 
lyzed. In particular, amino acid residues located within 4 A of the ligand were identified 
and their prospective interaction types were inferred. Eight interactions were observed 
between receptor protein and ligand. No ionic interaction was found but there were 
two hydrogen bonds between an oxygen atom of Asn-38 and a hydrogen in the ligand, 
and an oxygen of Gln-36 in the receptor protein and a hydrogen in the ligand. The 
bond distances between Asn-38 and Gln-36 and the ligand were respectively 3.45 A 
and 2.85A. Six hydrophobic interactions were observed between carbon atoms of the 
receptor protein and ligand. A carbon in Ile-119 interacted hydrophobically with a car- 
bon in the ligand at 3.80 A bond distance. Another carbon atom of Ile-119 also 



Table 5 Summary of tools used 



Sr.No. 


Tools/Databases 


Output/Function 


1 


BioGPS 


Expression profiling 


2 


GeneCards 


Expression profiling 


3 


UniProt 


Protein Data 


4 


MODELLER 


Structure prediction 


5 


AutoDock 


Docking Analysis 


6 


HOPE server 


Mutational analysis 


7 


STRING 


Protein-Protein Interactions 


8 


PubChem 


Ligand Retrieval 


9 


ChemDraw 


Ligand Drawing 


10 


GRAMM-X 


Protein-Protein Docking 


11 


ENSEMBL 


Retrieval of Phylogenetic Sequences 


12 


VMD 


Binding interactions of docked protein-ligand complexes 


13 


PyMOL 


Binding interactions of docked protein-protein complexes 


14 


MEGA 


Phylogenetic Analysis 


15 


Chimera 


Visualization, Superimposition, Interaction 


16 


LigPlot 


For interaction 
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interacted with a ligand carbon at 3.84 A bond distance. Ser-121 and Ser-71 of the re- 
ceptor protein also showed hydrophobic bonding with carbon atoms of the ligand with 
bond distances 3.56 A and 3.89 A respectively. Carbon atoms of Ala-31 and Thr-122 
also interacted with the ligand and had 3.96 A and 3.87 A bond distances. 

GRAMM-X and PatchDock were utilized for DAOA protein-protein docking. The 
PyMol visualization tool was used for post-docking analysis of protein-protein interac- 
tions. DAOA and its functionally interacting partner DAO showed two ionic interac- 
tions: an oxygen atom in Arg-286 of the ligand interacted with a hydrogen in Tyr-03 
with bond distance 2.5 A; and a nitrogen atom in Lys-163 of the ligand interacted with 
an oxygen in Lys-7 with bond distance 3.5 A. An oxygen atom in Asp-2037 of DAO 
formed an ionic bond with a nitrogen atom in Lys-7 of DAOA with bond distance 
3.3 A. 

Conclusion 

For receptor-ligand interactions, both functional and expressional studies show that the 
product of DAOA interacts with the enzyme D-amino acid oxidase and modulates its activity. 
Glutamate signaling is involved in important pathways directly implicated in SZ [15]. The lig- 
and examined here (C28H28N3O5PS2) can be used as a biomolecule, and in vivo experiments 
could be performed in mice to check its effects and interactions, with a view to developing 
an approved drug for SZ. More than 80% homology between human and primates is strong 
evidence for an ancestral relationship that will help in predicting protein function and family. 
Our research suggests a baseline for the design, development and validation of novel drugs 
against SZ. 

Materials & methods 

The amino acid sequence of DAOA (125 residues) was used for homology modeling 
since it is involved in SZ. The sequence was retrieved in FASTA format from Uniprot 
Knowledge base with accession number A2T115. 

The retrieved amino acid sequence of DAOA was subjected to a protein-protein 
BLAST (BLASTp) search against the Protein Data Bank (PDB) to identify a suitable 
template structure for comparative modeling. [PDB ID: 1ZCA] was selected as a suit- 
able template with query sequence having 27% identity, 33% query coverage and E- 
value 4.9. The automated protein modeling program MODELLER 9vl0 was used to 
generate models. It predicted the 3D structure of the protein by satisfying spatial 
restraints [18]. The evaluation tools Rampage and ERR AT were applied to assess the 
predicted 3D model of DAOA. Rampage generated a Ramachandran plot and ERRAT 
evaluated the quality of the predicted structures. 

The mutation was retrieved from the HGMD biological database. The HOPE server 
was used for mutational analysis. To investigate ancestral relationships, the most popu- 
lar software for phylogenetic analysis, the Molecular Evolutionary Genetic Algorithm 
(MEGA 5), was used on DAOA. The distance-based approach was applied using Neigh- 
bor-Joining, considering a bootstrap value of 1000 replications to construct the phylo- 
genetic tree. 

Blind docking was simulated to identify the specific binding site for receptor-ligand 
and protein-protein interactions. The coordinates of the ligand molecule [Accession 
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number PubChem: 19894] were obtained from PubChem. The SDF (Sql Database File) 
format of the ligand was transformed into .pdb format using Chem Draw Ultra Version 
8.0 [19]. The .pdb formats of protein (receptor) and ligand were used as input files to 
run AutoDock Vina. To determine the binding affinities between DAOA and the 
selected ligand, a flexible automated docking program was applied to the active site of 
the protein using AutoDock Vina. AutoDock has a grid map to aid the actual docking 
process. The dimensions of the grid were 40 x 40 x 40 points for the ligand with 0.375 
A spacing between the grid points. 

The STRING server was used to assess the protein interactions of DAOA. It is an on- 
line database of known and predicted protein interactions including direct (physical) 
and indirect (functional) relationships. Protein docking of DAOA with its interactive 
protein DAO was simulated using PatchDock and Gramm-X. Visual Molecular Dynam- 
ics (VMD) software with a table of all tools utilized was used to visualize assay and 
post-docking analysis for protein-ligand docking; PyMol software was used for protein- 
protein docking. The tools employed in this study are listed in Table 5. 
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